Improving Chinese Named Entity Recognition by Interactive Fusion of Contextual Representation and Glyph Representation

Abstract

Named entity recognition (NER) is a fundamental task in natural language processing. In Chinese NER, additional resources such as lexicons, syntactic features and knowledge graphs are usually introduced to improve the performance of the model. However, Chinese characters evolved from pictographs, and their glyphs contain rich semantic information, which is often ignored. Therefore, in order to make full use of the semantic information contained in Chinese character glyphs, we propose a Chinese NER model that combines contextual representation and glyph representation, named CGR-NER (Character–Glyph Representation for NER). First, CGR-NER uses a large-scale pre-trained language model to dynamically generate contextual representations of characters. Secondly, a hybrid neural network combining a three-dimensional convolutional neural network (3DCNN) and a bi-directional long short-term memory network (BiLSTM) is designed to extract the semantic information contained in each glyph, the potential word formation knowledge between adjacent glyphs, and the global dependency features of the glyph sequence. Thirdly, an interactive fusion method with crossmodal attention and a gate mechanism is proposed to fuse the representations from the different models dynamically. The experimental results show that our model achieves F1 scores of 82.97% and 70.70% on the OntoNotes 4 and Weibo datasets, respectively. Multiple ablation studies also verify the advantages and effectiveness of the model.
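The abstract does not give the equations behind the interactive fusion step, so the following is only a minimal sketch of the general idea of cross-attention followed by a gate, written in plain Python. Everything in it is an assumption for illustration: the function names (`cross_attention`, `gated_fusion`), the scaled dot-product scoring, the single scalar gate per character, and the toy vectors are hypothetical and are not taken from CGR-NER.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def cross_attention(queries, keys_values):
    """For each query vector, return a softmax-weighted average of keys_values.

    Assumption: scaled dot-product scores, with keys and values being the same vectors.
    """
    d = len(queries[0])
    attended = []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(d) for k in keys_values]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(d)]
        )
    return attended

def gated_fusion(context, glyph, gate_weights, gate_bias):
    """Fuse two per-character sequences (hypothetical formulation).

    a = cross_attention(context, glyph)          # glyph info aligned to each character
    g = sigmoid(gate_weights . [c; a] + gate_bias)
    fused = g * c + (1 - g) * a
    """
    attended = cross_attention(context, glyph)
    fused = []
    for c, a in zip(context, attended):
        g = sigmoid(dot(gate_weights, c + a) + gate_bias)  # c + a concatenates the lists
        fused.append([g * ci + (1.0 - g) * ai for ci, ai in zip(c, a)])
    return fused

# Toy example: two characters, two-dimensional representations (made-up numbers).
context = [[1.0, 0.0], [0.0, 1.0]]
glyph = [[0.5, 0.5], [1.0, -1.0]]
fused = gated_fusion(context, glyph, gate_weights=[0.0, 0.0, 0.0, 0.0], gate_bias=0.0)
```

With all-zero gate parameters the gate equals 0.5, so each fused vector is the plain average of the contextual vector and the attended glyph vector. In a trained model the gate parameters would be learned, letting the model decide per character how much to rely on each source.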

Similar resources

Improving Named Entity Recognition for Chinese Social Media with Word Segmentation Representation Learning

Named entity recognition, and other information extraction tasks, frequently use linguistic features such as part of speech tags or chunkings. For languages where word boundaries are not readily identified in text, word segmentation is a key first step to generating features for an NER system. While using word boundary tags as features is helpful, the signals that aid in identifying these boun...

Effective Word Representation for Named Entity Recognition

Recently, various machine learning models have been built using word-level embeddings and have achieved substantial improvement in NER prediction accuracy. Most NER models only take words as input and ignore character-level information. In this paper, we propose an effective word representation that efficiently includes both the word-level and character-level information by averaging its charac...

Learning Entity Representation for Named Entity Disambiguation

In this paper we present a novel disambiguation model, based on neural networks. Most existing studies focus on designing effective man-made features and complicated similarity measures to obtain better disambiguation performance. Instead, our method learns distributed representation of entity to measure similarity without man-made features. Entity representation consists of context document re...

Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks

Biomedical Named Entity Recognition (BNER), which extracts important entities such as genes and proteins, is a crucial step of natural language processing in the biomedical domain. Various machine learning-based approaches have been applied to BNER tasks and showed good performance. In this paper, we systematically investigated three different types of word representation (WR) features for BNER...

Hallym: Named Entity Recognition on Twitter with Induced Word Representation

Twitter is a type of social media that contains diverse user-generated texts. Traditional models are not applicable to tweet data because the text style is not as grammaticalized as that of newswire. In this paper, we construct word embeddings via canonical correlation analysis (CCA) on a considerable amount of tweet data and show the efficacy of word representation. Besides word embedding, we ...

Journal

Journal title: Applied Sciences

Year: 2023

ISSN: 2076-3417

DOI: https://doi.org/10.3390/app13074299